Introductions


Tomcat since 2003 
Committer, PMC member 


Commons (Daemon, Pool, DBCP, BCEL) 
Committer, PMC member 


ASF member, ASF security team, ASF infrastructure team, Director 2016 to 2019
VP, Brand Management since 2018 


Java EE Expert groups for Servlet, WebSocket, Expression Language 


Jakarta Servlet, Pages, WebSocket and Expression Language
Committer 
“Project Loom is to intended
to explore, incubate and 
deliver Java VM features and 
APIs built on top of them for 
the purpose of supporting 
easy-to-use, high-throughput 
lightweight concurrency and 
new programming models on 
the Java platform.” 


https://wiki.openjdk.org/display/loom 
A Brief History of Servlet Scalability


HTTP/1.0 

HTTP/1.1 and keep-alive 

Tomcat, blocking I/O (BIO, 3.x) and thread starvation
Tomcat, non-blocking I/O (NIO, 6.0.x / NIO2, 8.0.x) 
Servlet asynchronous API and non-blocking I/O (7.0.x) 
HTTP/1.0 
Connect, make request, close
One thread per connection 


Maximum connections
=
Maximum concurrent requests
=
Thread pool size


Creating connections is 
(relatively) expensive 


HTTP/1.1 and keep-alive

HTTP/1.0 had keep-alive, but with
interoperability issues


HTTP/1.1 fixed the issues
Better (lower) latency 
Worse scalability 


Typically uses more threads than 
there are concurrent requests 


Thread starvation 


Tomcat BIO connector disabled 
HTTP keep-alive for the last 25% 
of threads in the thread pool 




Non-blocking I/O part 1 - between requests 




Tomcat NIO / NIO2 connectors 


Use non-blocking I/O while 
waiting for a new request 


Only use a thread for 
connections where there is a 
request to be processed 


Maximum connections
>>
Maximum concurrent requests
=
Thread pool size

HTTP keep-alive latency benefits 


Improved scalability 


Non-blocking I/O part 2 - Servlet asynchronous API 
Use non-blocking I/O to
communicate with services


Only use a thread for
connections where there is a
request to be processed


Maximum concurrent requests


Further improved scalability


Virtual Threads 




Pre-Java 21 threads are now referred to as platform threads
Virtual threads
• Not mapped to a dedicated OS thread
• Use the heap for stack
• Created for a task and then allowed to terminate
• Do not pool virtual threads
• Have their own scheduler
Virtual thread scheduler has a pool of platform threads to do the work
• One platform thread per processor by default
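
The "created for a task and then allowed to terminate" model can be sketched with the Java 21 `Thread.ofVirtual()` builder. This is a minimal illustration rather than anything taken from Tomcat; the class name, method name and thread name are made up for the example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class (not from the talk). Requires Java 21 or later.
class VirtualThreadBasics {

    /** Runs the task on a brand new virtual thread and reports whether that thread was virtual. */
    static boolean runOnVirtualThread(Runnable task) {
        AtomicBoolean wasVirtual = new AtomicBoolean();
        // One virtual thread per task: created, used once, then allowed to terminate.
        Thread thread = Thread.ofVirtual().name("demo-virtual").start(() -> {
            wasVirtual.set(Thread.currentThread().isVirtual());
            task.run();
        });
        try {
            thread.join(); // wait for the task to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the task", e);
        }
        return wasVirtual.get();
    }

    public static void main(String[] args) {
        boolean virtual = runOnVirtualThread(
                () -> System.out.println("Running on " + Thread.currentThread()));
        System.out.println("Task ran on a virtual thread: " + virtual);
    }
}
```

No pool is involved: each call starts a fresh thread, which is the usage pattern the bullets above recommend.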


Blocking operations 


Platform threads 
• Thread waits for operation to complete
Virtual threads
• Non-blocking operation started
• Virtual thread suspended and platform thread released
• Operation completes
• Virtual thread resumed and becomes eligible to be scheduled
• Execution continues
Virtual threads are effectively non-blocking for many blocking operations
• Increased scalability for “free”
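
As a rough sketch of that sequence, the example below submits many tasks that each make a blocking call (`Thread.sleep`) to a virtual-thread-per-task executor. The class and method names are invented for illustration, and the task count and delay are arbitrary.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class (not from the talk). Requires Java 21 or later.
class BlockingOnVirtualThreads {

    /** Runs taskCount tasks that each block for the given delay; returns how many completed. */
    static int runBlockingTasks(int taskCount, Duration delay) {
        AtomicInteger completed = new AtomicInteger();
        // close() (called by try-with-resources) waits for all submitted tasks to finish.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.execute(() -> {
                    try {
                        // Blocking call: the virtual thread is suspended here and
                        // its carrier platform thread is free to run other tasks.
                        Thread.sleep(delay);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    completed.incrementAndGet();
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int done = runBlockingTasks(1_000, Duration.ofMillis(100));
        System.out.println("Completed tasks: " + done);
    }
}
```

The task code is written in a plain blocking style; any scalability benefit comes from the runtime rather than from restructuring the code.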


Coding constraints 


Beware of pinning 


• Long lasting blocking operations are problematic
• Brief synchronized blocks are fine


ThreadLocals
• Providing context across an API boundary OK
• Caching could be problematic
Virtual threads


Impact on throughput? 
Impact on scalability? 


Impact on GC? 


Impact on memory footprint? 
Impact of extra scheduler? 


Impact on code complexity? 


Impact of constraints? 


Investigations 


Lots of areas to explore 

Areas are not independent 

Try and focus on a single variable 

Performance tests only ever indicative 

Not meant to be representative of real applications 


This work is just a starting point 


© VMware, Inc.


Throughput 




Aims:
• compare virtual and platform threads in same scenario
• minimise impact of other factors
• not looking to identify maximums
• relative, rather than absolute, results were primary interest
Examined:
• Different sized requests
• Different concurrencies
• Configured to minimise Tomcat and web application processing time
• Details at https://spring.io/blog/2023/02/27/web-applications-and-project-loom
Throughput


[Chart: Loom (virtual threads) vs. Thread Pool (platform threads), one series each for response sizes of 0.25, 1, 4, 16, 64, 256, 1024, 4096 and 16384 kB]


The bigger the response size, the less the difference 


Platform thread performance is worse with 
concurrency of 2 than it is with 1 


Virtual threads have higher throughput and this is 
more obvious with smaller response sizes 


Once concurrency exceeds processor count, virtual 
threads show increased throughput compared to 
platform threads 


Tomcat’s thread pool uses LinkedBlockingQueue for 
the task queue by default. 


The virtual thread scheduler uses a work stealing 
queue by default. 
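
To make the two queueing setups concrete, the sketch below runs the same tasks on a fixed pool of platform threads fed by a `LinkedBlockingQueue` and on a virtual-thread-per-task executor. The JDK `ThreadPoolExecutor` is used here only as a stand-in for Tomcat's own executor, and all class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class (not Tomcat code). Requires Java 21 or later.
class QueueStyles {

    /** Fixed pool of platform threads; waiting tasks sit in a single shared LinkedBlockingQueue. */
    static ExecutorService platformPool(int threads) {
        return new ThreadPoolExecutor(threads, threads, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
    }

    /** One new virtual thread per task, scheduled by the JDK's virtual thread scheduler. */
    static ExecutorService virtualPerTask() {
        return Executors.newVirtualThreadPerTaskExecutor();
    }

    /** Submits taskCount trivial tasks, waits for them all, and returns how many ran. */
    static int runAll(ExecutorService executor, int taskCount) {
        AtomicInteger done = new AtomicInteger();
        try (executor) { // close() waits for submitted tasks to complete
            for (int i = 0; i < taskCount; i++) {
                executor.execute(done::incrementAndGet);
            }
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("Platform pool ran: " + runAll(platformPool(4), 10_000));
        System.out.println("Virtual threads ran: " + runAll(virtualPerTask(), 10_000));
    }
}
```

Both executors complete the same work; the sketch shows only where tasks wait, not the throughput differences reported above.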




Bonus results 


These results are from some informal testing
• Much higher concurrency than my tests (8192 concurrent users)
• Any errors are my fault
• Any credit is due to Violeta Georgieva
Platform threads
• 3,366,303 requests with 100% complete within 800ms
Virtual threads
• 3,408,798 requests with 89% complete within 800ms
• 9% complete between 800ms and 1200ms
• 2% complete in more than 1200ms
Easy to Use


Servlet Blocking IO 
Counting request body bytes 


protected void doPost(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException {
    resp.setContentType("text/plain");
    resp.setCharacterEncoding("UTF-8");
    ServletInputStream sis = req.getInputStream();
    byte[] buffer = new byte[8192];
    int read = -1;
    int totalBytesRead = 0;
    // Blocking read: the thread waits here until data arrives or the body ends
    while ((read = sis.read(buffer)) > -1) {
        if (read > 0) {
            totalBytesRead += read;
        }
    }
    ServletOutputStream sos = resp.getOutputStream();
    String msg = "Total bytes written = [" + totalBytesRead + "]";
    sos.write(msg.getBytes(StandardCharsets.UTF_8));
}
Servlet Non-blocking IO
Counting request body bytes 


protected void doPost(HttpServletRequest req, HttpServletResponse resp)
        throws ServletException, IOException {
    resp.setContentType("text/plain");
    resp.setCharacterEncoding("UTF-8");
    AsyncContext ac = req.startAsync();
    // The listener registers itself for read and write events in its constructor
    CounterListener listener =
            new CounterListener(ac, req.getInputStream(), resp.getOutputStream());
}


private static class CounterListener implements ReadListener, WriteListener {
    private final AsyncContext ac;
    private final ServletInputStream sis;
    private final ServletOutputStream sos;
    private volatile boolean readFinished = false;
    private volatile long totalBytesRead = 0;
    private byte[] buffer = new byte[8192];


    private CounterListener(AsyncContext ac, ServletInputStream sis,
            ServletOutputStream sos) {
        this.ac = ac;
        this.sis = sis;
        this.sos = sos;
        sis.setReadListener(this);
        sos.setWriteListener(this);
    }


    public void onDataAvailable() throws IOException {
        int read = 0;
        // Only read while data can be read without blocking
        while (sis.isReady() && read > -1) {
            read = sis.read(buffer);
            if (read > 0) {
                totalBytesRead += read;
            }
        }
    }


    public void onAllDataRead() throws IOException {
        readFinished = true;
        if (sos.isReady()) {
            onWritePossible();
        }
    }


    public void onWritePossible() throws IOException {
        if (readFinished) {
            String msg = "Total bytes written = [" + totalBytesRead + "]";
            sos.write(msg.getBytes(StandardCharsets.UTF_8));
            ac.complete();
        }
    }


    public void onError(Throwable throwable) {
        ac.complete();
    }
}
Easy to Use


Aims: 
• Compare virtual threads with blocking code to thread pool with non-blocking code
• Minimise other factors

Examined:
• External service that blocked and waited a preset time before continuing
• Service ‘delay’ dominated initial results
Virtual threads generally a little more performant


Difference more noticeable at low concurrency and
when concurrency exceeds processor cores


Performance of blocking code with virtual threads is
comparable to refactoring to use non-blocking APIs


[Chart: Thread pool vs. Virtual threads]
Coding constraints


Pinning 


Detect it with -Djdk.tracePinnedThreads=[full|short] 
Logs issues to stdout as they are detected 
Tomcat experience 
• Configured unit tests to run with this detection enabled
• Identified a handful of issues in HTTP/2
Can replace synchronized with ReentrantLock
• Need to be careful to ensure lock is released
• Make sure all uses of synchronized are replaced for a given object
Pinning


Replace
synchronized (lock) {
    ...
}
With
Lock lock = new ReentrantLock();
...
lock.lock();
try {
    ...
} finally {
    lock.unlock();
}
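
A small runnable sketch of the same replacement follows, assuming a simple counter as the guarded state. The `LockedCounter` class is hypothetical and not taken from Tomcat; the point is the `lock()` / `try` / `finally` / `unlock()` shape, which guarantees the lock is released even if the guarded code throws.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class (not Tomcat code). Requires Java 21 or later.
class LockedCounter {

    // Every access to 'value' goes through this one lock; no synchronized blocks
    // remain for this object, so the two mechanisms are never mixed.
    private final Lock lock = new ReentrantLock();
    private long value;

    long increment() {
        lock.lock();
        try {
            return ++value;
        } finally {
            lock.unlock(); // always released, even if the guarded code throws
        }
    }

    long get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    /** Increments a shared counter once from each of 'tasks' virtual threads. */
    static long countWithVirtualThreads(int tasks) {
        LockedCounter counter = new LockedCounter();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(counter::increment);
            }
        } // close() waits for all increments to finish
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("Final count: " + countWithVirtualThreads(1_000));
    }
}
```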
ThreadLocal alternatives


Tomcat uses ThreadLocal for thread-safe caching of objects that are expensive to create
• Matchers in RewriteValve
• RequestDispatcher request mapping
• etc
Options to implement this with virtual threads
• No change, continue to use ThreadLocal
• Always create a new Object
• Cache using SynchronizedStack (or similar)
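
The third option can be pictured with a small shared cache guarded by a brief synchronized block. The class below is a simplified, hypothetical stand-in written for this example; it is not Tomcat's SynchronizedStack and its API is an assumption.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical stand-in for a shared object cache (not Tomcat's SynchronizedStack).
class SimpleObjectCache<T> {

    private final ArrayDeque<T> stack = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int limit;

    SimpleObjectCache(Supplier<T> factory, int limit) {
        this.factory = factory;
        this.limit = limit;
    }

    /** Returns a cached instance if one is available, otherwise creates a new one. */
    T acquire() {
        T cached;
        synchronized (stack) { // brief critical section, no blocking I/O inside
            cached = stack.pollFirst();
        }
        return cached != null ? cached : factory.get();
    }

    /** Returns an instance to the cache; it is dropped if the cache is already full. */
    void release(T object) {
        synchronized (stack) {
            if (stack.size() < limit) {
                stack.addFirst(object);
            }
        }
    }

    public static void main(String[] args) {
        SimpleObjectCache<StringBuilder> cache = new SimpleObjectCache<>(StringBuilder::new, 2);
        StringBuilder first = cache.acquire();   // cache is empty, so a new object is created
        cache.release(first);
        System.out.println("Reused: " + (cache.acquire() == first));
    }
}
```

Unlike a ThreadLocal, the cached objects are shared by all threads, so the cache size is bounded by the limit rather than by the number of threads.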
ThreadLocal
• Should be slower than new Object for virtual threads
new Object
• Lose the benefit of caching
• Is caching still required?
SynchronizedStack
• May be slower under high concurrency
Ran a series of tests
• Non-dispatching request / dispatching request
• 8, 16, 32 & 64 concurrent users (machine under test has 20 cores)
• new Object() / SynchronizedStack / ThreadLocal
• Platform threads / Virtual Threads
Ran each combination for 11 runs of 60 seconds
• Dropped the first result (warm-up)
• Took average of remaining 10
Results were inconclusive
• No clear winner or loser
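
The averaging step described above (drop the first run as warm-up, then average the rest) could be expressed as the helper below. The class name is invented and the figures in `main` are made-up placeholders, not results from these tests.

```java
import java.util.Arrays;

// Hypothetical helper (not from the talk) for averaging benchmark runs.
class WarmupAverage {

    /** Discards the first run as warm-up and returns the mean of the remaining runs. */
    static double averageAfterWarmup(double[] runs) {
        if (runs.length < 2) {
            throw new IllegalArgumentException("Need a warm-up run plus at least one measured run");
        }
        // Stream over indices 1..length-1, skipping the warm-up result at index 0.
        return Arrays.stream(runs, 1, runs.length).average().orElseThrow();
    }

    public static void main(String[] args) {
        // Placeholder numbers only: one warm-up value followed by three measured values.
        double[] runs = {100.0, 10.0, 20.0, 30.0};
        System.out.println("Average after warm-up: " + averageAfterWarmup(runs));
    }
}
```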
Conclusions


Applications currently using non-blocking APIs will likely see minimal differences with 
virtual threads 


Applications currently using blocking APIs 
• will likely see minimal throughput differences with virtual threads
• will likely see measurable scalability improvements with virtual threads


Code changes may be required for:
• long lasting blocking operations
• ThreadLocals
Next Steps


Tomcat included virtual thread support in the June 2023 releases 
Tomcat 11 
• Requires a minimum of Java 21
Tomcat 8.5, 9.0 & 10.1
• No change to minimum Java versions
• Requires Java 21 to use virtual threads
Future Tomcat development
• Investigate bottlenecks as they get reported
Questions...
Stay Connected SpringOne


Discuss all things Tomcat @ 
https://tomcat.apache.org/lists.html


Visit @ https://tomcat.apache.org 


https://github.com/apache/tomcat


For a discussion this week 
markt@apache.org 
Thank you


